Accurate prediction of enzyme mutant activity based on a multibody statistical potential
Abstract
MOTIVATION: An important area of research in biochemistry and molecular biology focuses on the characterization of enzyme mutants. However, synthesis and analysis of experimental mutants is time-consuming and expensive. We describe a machine-learning approach for inferring the activity levels of all unexplored single point mutants of an enzyme, based on a training set of such mutants with experimentally measured activity.
RESULTS: Each enzyme mutant is uniquely characterized, for model development and prediction, by a perturbation vector that measures environmental changes relative to wild type (wt) at every residue position; the vector is computed from a four-body statistical potential function derived from Delaunay tessellation. First, model performance measured by the area under the receiver operating characteristic (ROC) curve (AUC) surpasses 0.83 and 0.77 for data sets of experimental HIV-1 protease and T4 lysozyme mutants, respectively. Second, a novel method is introduced for evaluating the statistical significance associated with the number of correct test set predictions obtained from a trained model. Third, 100 stratified random splits of the protease and T4 lysozyme mutant data sets into training and test sets achieve 77.0% and 80.8% mean accuracy, respectively. Next, protease and T4 lysozyme models trained with experimental mutants are used to predict activity levels for all remaining mutants; a subsequent search for publications reporting on dozens of these test mutants reveals that experimental results are matched by 79% and 86% of predictions, respectively. Finally, learning curves for each mutant enzyme system indicate the influence of training set size on model performance.
AVAILABILITY: Prediction databases at http://proteins.gmu.edu/automute/
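The evaluation protocol named in the abstract (AUC plus mean accuracy over repeated stratified random splits) can be illustrated with a minimal sketch. It is not the authors' implementation. The binary 0/1 activity encoding, the 20% test fraction, and the `fit_predict` callback are all assumptions made for illustration; the abstract specifies none of them. The sketch does not attempt the four-body potential or the significance test.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 0/1 values (1 = active mutant; this encoding is an assumption).
    scores: predicted scores, higher meaning "more likely active".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), sampling each class separately."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def mean_accuracy(labels, fit_predict, n_splits=100, test_fraction=0.2, seed=0):
    """Average test accuracy over repeated stratified random splits.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains
    on the training indices and returns one predicted label per test index.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_splits):
        train, test = stratified_split(labels, test_fraction, rng)
        predicted = fit_predict(train, test)
        correct = sum(p == labels[i] for p, i in zip(predicted, test))
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)
```

In practice `fit_predict` would wrap whatever classifier is trained on the perturbation vectors. A trivial baseline that always predicts the majority class gives a useful point of comparison for the reported accuracies.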
Similar resources
QSARS OF ANTI-FUNGAL ACTIVITY OF FURAN CARBOXANILIDE DERIVATIVES AGAINST WILD AND MUTANT STRAINS OF USTILAGO MAYDIS
The structural requirements for the inhibitor activity of various furan carboxanilide derivatives against succinate dehydrogenase complex (SDC) activity in mitochondria of either wild or mutant strains of Ustilago maydis were investigated with the aid of Hansch QSAR analysis. It has been found that the inhibitor activity against both types of enzymes is best related to the ??? or ??M of th...
The Comparison of Direct and Indirect Optimization Techniques in Equilibrium Analysis of Multibody Dynamic Systems
The present paper describes a set of procedures for solving nonlinear static-equilibrium problems in complex multibody mechanical systems. To find the equilibrium position of the system, five optimization techniques are used to minimize the total potential energy of the system, and the techniques are compared. A computer program is developed to evaluate the equality co...
UV mutagenesis for the overproduction of xylanase from Bacillus mojavensis PTCC 1723 and optimization of the production condition
Objective(s): This study highlights xylanase overproduction from Bacillus mojavensis via UV mutagenesis and optimization of the production process. Materials and Methods: Bacillus mojavensis PTCC 1723 underwent UV radiation. Primary screening of the mutants was based on the enhanced Hollow Zone Diameter/Colony Diameter ratio (H/C ratio) of the colonies in comparison with the wild strain on Xyla...
Identification and Functional Characterization of Arabidopsis icl Mutant Under Trehalose Feeding in Light and Dark Conditions
Trehalose is a non-reducing sugar that plays an important role in plant growth and development. To study the role of trehalose in lipid metabolism and gluconeogenesis, Arabidopsis thaliana wild type (WT) and TreF (a line expressing trehalase) were grown on ½ MS medium with or without 100 mM sucrose and/or trehalose in light or continuous darkness. In dark, trehalose leads skotomorphoge...
Structure-based prediction of transcription factor binding specificity using an integrative energy function
Transcription factors (TFs) regulate gene expression through binding to specific target DNA sites. Accurate annotation of transcription factor binding sites (TFBSs) at genome scale represents an essential step toward our understanding of gene regulation networks. In this article, we present a structure-based method for computational prediction of TFBSs using a novel, integrative ener...
Journal: Bioinformatics
Volume 23, Issue 23
Pages: -
Publication date: 2007